Context Based Indexing On Synonym System Using Hierarchical Clustering In Web Mining

نویسنده

  • D. Yitong Wang
چکیده

Now a days, the World Wide Web is the collection of large amount of information which is increasing day by day. For this increasing amount of information, there is a need for efficient and effective indexing structure. Indexing in search engines has become the major issue for improving the performance of Web search engines, so that the most relevant web documents are retrieved in minimum possible time. For this a new indexing mechanism in search engine is proposed which is based on indexing the synonym terms of the web documents, a synonym term which have multiple context with same meaning of the web documents. The indexing is performed on the bases of hierarchical clustering method which clustered the similar term documents into the same cluster and these clusters are clubbed together to form mega cluster on the basis of synonym term. With the similarity of clusters, it will optimize the search process by forming the different levels of hierarchy. Finally, it will give fast and relevant retrieval of web documents to the user. Keywords-Context, Synonym; Hierarchical Clustering; Indexing; Ontology Repository. __________________________________________________*****_________________________________________________

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Indexing Technique for Web Documents using Hierarchical Clustering

The information on the WWW is growing at an exponential rate; therefore, search engines are required to index the downloaded Web documents more efficiently. Web mining techniques like clustering can be used for this purpose. In this paper, a novel technique to index the documents is being proposed that not only indexes the documents more efficiently but also uses hierarchical clustering to keep...

متن کامل

Clustering Web Page Search Results: a Full Text Based Approach

With so much information available on the web, looking for relevant documents on the Internet has become a tough task. In this paper we present as approach which is a match between a query-based Google and a category-based Yahoo. WISE is a web page hierarchical clustering system, language and topic independent, supported by a graph based overlapping algorithm that groups web relevant pages into...

متن کامل

Evaluation of text clustering methods using wordnet

The increasing number of digitized texts presently available notably on the Web has developed an acute need in text mining techniques. Clustering systems are used more and more often in text mining, especially to analyze texts and to extract knowledge they contain. With the availability of the vast amount of clustering algorithms and techniques, it becomes highly confusing to a user to choose t...

متن کامل

Improving Information Retrieval Using Document Clusters and Semantic Synonym Extraction

Document clustering has been investigated for use in a number of different areas of text mining and information retrieval. Initially, document clustering was investigated for improving the precision or recall in information retrieval systems and as an efficient way of finding the nearest neighbors of a document. More recently, clustering has been proposed for use in browsing a collection of doc...

متن کامل

Developing a Recommendation Framework for Tourist by Mining Geo-tag Photos (Case Study Tehran District 6)

With the increasing popularity of sharing media on social networks and facilitating access to location technologies, such as Global Positioning System (GPS), people are more interested to share their own photos and videos. The world wide web users are no longer the sole consumer but they are producers of information also, hence a wealth of information are available on web 2.0 applications. The ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017